| Prediction (rows) / Truth (columns) | Continuum | Room For Squares | The Search for Everything |
|---|---|---|---|
| Continuum | 2 | 1 | 2 |
| Room For Squares | 4 | 12 | 2 |
| The Search for Everything | 6 | 0 | 8 |
# A tibble: 3 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy multiclass 0.595
2 kap multiclass 0.388
3 j_index macro 0.382
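As a sanity check, the three metrics above can be recomputed by hand from the confusion matrix counts. The sketch below is written in Python purely for illustration (the output above appears to come from R). It assumes that rows are predictions and columns are truth, and that the macro `j_index` is the unweighted mean of sensitivity + specificity - 1 per class. The function names are my own.

```python
# Confusion matrix counts: rows = predicted album, columns = true album.
LABELS = ["Continuum", "Room For Squares", "The Search for Everything"]
CONFUSION = [
    [2, 1, 2],   # predicted Continuum
    [4, 12, 2],  # predicted Room For Squares
    [6, 0, 8],   # predicted The Search for Everything
]


def accuracy(cm):
    """Share of tracks on the diagonal (predicted album == true album)."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def cohen_kappa(cm):
    """Agreement corrected for the agreement expected by chance."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    row_totals = [sum(cm[i]) for i in range(k)]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    observed = accuracy(cm)
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)


def macro_j_index(cm):
    """Unweighted mean over classes of sensitivity + specificity - 1."""
    n = sum(sum(row) for row in cm)
    k = len(cm)
    scores = []
    for c in range(k):
        tp = cm[c][c]
        fp = sum(cm[c]) - tp                        # predicted c, truth differs
        fn = sum(cm[i][c] for i in range(k)) - tp   # truth c, predicted differs
        tn = n - tp - fp - fn
        scores.append(tp / (tp + fn) + tn / (tn + fp) - 1)
    return sum(scores) / k


print(round(accuracy(CONFUSION), 3),
      round(cohen_kappa(CONFUSION), 3),
      round(macro_j_index(CONFUSION), 3))
# prints: 0.595 0.388 0.382
```

The per-class terms show where the model struggles: only 2 of the 12 Continuum tracks are recovered, while 12 of the 13 Room For Squares tracks are.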
# A tibble: 3 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy multiclass 0.568
2 kap multiclass 0.349
3 j_index macro 0.341
$Continuum
35 x 1 sparse Matrix of class "dgCMatrix"
1
(Intercept) -0.5294009
danceability -0.6219245
energy .
loudness -2.8225666
speechiness 0.5421036
acousticness .
instrumentalness .
liveness .
valence .
tempo 1.5544615
duration 0.5740248
C .
`C#|Db` .
D .
`D#|Eb` 1.5222356
E .
F .
`F#|Gb` -1.7037551
G .
`G#|Ab` .
A 0.9029103
`A#|Bb` .
B .
c01 -5.2891822
c02 .
c03 .
c04 -0.3137720
c05 .
c06 .
c07 .
c08 .
c09 .
c10 4.6280590
c11 .
c12 .
$`Room For Squares`
35 x 1 sparse Matrix of class "dgCMatrix"
1
(Intercept) -0.01477458
danceability .
energy .
loudness .
speechiness .
acousticness -0.91673341
instrumentalness .
liveness .
valence .
tempo .
duration .
C .
`C#|Db` 0.22777826
D .
`D#|Eb` .
E .
F .
`F#|Gb` .
G .
`G#|Ab` .
A .
`A#|Bb` .
B .
c01 .
c02 .
c03 .
c04 .
c05 .
c06 .
c07 .
c08 .
c09 -4.79879773
c10 .
c11 .
c12 .
$`The Search for Everything`
35 x 1 sparse Matrix of class "dgCMatrix"
1
(Intercept) 0.54417548
danceability .
energy .
loudness .
speechiness .
acousticness 2.73740138
instrumentalness 0.22978022
liveness .
valence 0.19828772
tempo .
duration .
C .
`C#|Db` .
D .
`D#|Eb` -4.13621947
E .
F 1.42196539
`F#|Gb` .
G .
`G#|Ab` .
A -0.51648033
`A#|Bb` 3.17780521
B .
c01 .
c02 .
c03 .
c04 0.33928402
c05 .
c06 4.87299251
c07 -0.06887533
c08 .
c09 .
c10 .
c11 .
c12 .
# A tibble: 3 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy multiclass 0.649
2 kap multiclass 0.471
3 j_index macro 0.466
Call:
C5.0.default(x = x, y = y, trials = 1, control = C50::C5.0Control(minCases =
2, sample = 0))
C5.0 [Release 2.07 GPL Edition] Sun Mar 22 23:58:18 2020
-------------------------------
Class specified by attribute `outcome'
Read 37 cases (35 attributes) from undefined.data
Decision tree:
c09 <= -0.1415127: Room For Squares (14/1)
c09 > -0.1415127:
:...`D#\|Eb` <= -1.062942: The Search for Everything (4)
    `D#\|Eb` > -1.062942:
    :...liveness <= -0.3603865: Continuum (10/1)
        liveness > -0.3603865:
        :...liveness <= 0.567418: The Search for Everything (6)
            liveness > 0.567418: Continuum (3)
Evaluation on training data (37 cases):
Decision Tree
----------------
Size Errors
5 2( 5.4%) <<
   (a)   (b)   (c)    <-classified as
  ----  ----  ----
    12                (a): class Continuum
          13          (b): class Room For Squares
     1     1    10    (c): class The Search for Everything
Attribute usage:
100.00% c09
62.16% `D#\|Eb`
51.35% liveness
Time: 0.0 secs
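Read as code, the decision tree printed above is a short chain of threshold tests. The sketch below transcribes it into Python for illustration only; the model itself was fitted with C5.0, judging by the output. The function and argument names are hypothetical, the thresholds are copied from the printout, and the inputs are assumed to be the same (apparently standardised) feature values the model was trained on.

```python
def classify_album(c09, dsharp_eflat, liveness):
    """Apply the printed tree rules to one track.

    c09          -- ninth timbre coefficient
    dsharp_eflat -- chroma value for the pitch class D#|Eb
    liveness     -- liveness feature
    """
    if c09 <= -0.1415127:
        return "Room For Squares"
    if dsharp_eflat <= -1.062942:
        return "The Search for Everything"
    if liveness <= -0.3603865:
        return "Continuum"
    if liveness <= 0.567418:
        return "The Search for Everything"
    return "Continuum"


# (cases reaching the leaf, misclassified cases), copied from the printout
# in top-to-bottom order, e.g. "(14/1)" becomes (14, 1) and "(4)" becomes (4, 0).
LEAF_COUNTS = [(14, 1), (4, 0), (10, 1), (6, 0), (3, 0)]

total_cases = sum(cases for cases, _ in LEAF_COUNTS)
total_errors = sum(errors for _, errors in LEAF_COUNTS)
print(total_cases, total_errors)  # prints: 37 2

# Hypothetical feature values, only to show the call:
print(classify_album(c09=0.5, dsharp_eflat=0.0, liveness=0.1))
# prints: The Search for Everything
```

The leaf counts add up to the 37 training cases and the 2 training errors reported in the evaluation, which confirms that the rules were read correctly.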
# A tibble: 3 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy multiclass 0.676
2 kap multiclass 0.512
3 j_index macro 0.506
# A tibble: 3 x 3
.metric .estimator .estimate
<chr> <chr> <dbl>
1 accuracy multiclass 0.595
2 kap multiclass 0.389
3 j_index macro 0.383
These are the mosaic and the heatmap for three distinctive John Mayer albums. Room For Squares is the first album and also contains the biggest outlier in track popularity. The Search for Everything is the most recent album. Continuum sits somewhere in between and contains three very popular tracks. The plots compare the true album of each track with the album that the classifier predicted. Since it took me a while to get everything to work properly, I did not have much time left to look more closely at what these plots show or to include more information about the other plots.
K-means clustering with 4 clusters of sizes 2, 3, 4, 3
Cluster means:
danceability energy loudness speechiness acousticness instrumentalness
1 0.4483754 -0.5055769 -0.8542989 -0.2617314 0.4311570 -0.3560762
2 1.1628505 -0.3483975 -0.5210990 -0.3347407 0.3168866 -0.3472233
3 -0.6672185 1.2108816 1.1846645 0.6608718 -1.2083289 0.3202983
4 -0.5721427 -0.9290601 -0.4889211 -0.3719341 1.0067805 0.1575429
liveness valence tempo duration C C#|Db
1 0.72504390 -0.5811429 -0.4569806 0.3095383 -0.6452313 -0.06133185
2 -0.49929630 0.5065896 -0.4990188 0.2360117 0.4125469 0.23265043
3 -0.01787979 0.5472212 0.4486234 0.3466649 -0.4925857 0.28181579
4 0.03977342 -0.8487893 0.2055080 -0.9045904 0.6743883 -0.56751692
D D#|Eb E F F#|Gb G
1 -0.33204926 -0.7399263 0.09467322 1.30255090 1.9057041 0.3395602
2 -0.01792557 1.2025659 -1.31851366 0.04740666 -0.8564878 -0.5684698
3 -0.08188922 -0.1916162 0.05872010 -0.16745673 -0.2575904 -0.5981355
4 0.34847737 -0.4537935 1.17710471 -0.69249828 -0.0705277 1.1396102
G#|Ab A A#|Bb B c01 c02
1 -0.65154940 -0.46684298 -0.6774957 -0.11841796 -1.0827059 -0.3413168
2 1.18316804 -0.32349614 0.4445460 -0.90263985 -0.3498639 -0.3604129
3 -0.08164101 0.09139234 0.6927170 0.75363901 1.1463431 1.0878198
4 -0.63994710 0.51286834 -0.9165048 -0.02326685 -0.4567895 -0.8624690
c03 c04 c05 c06 c07 c08 c09
1 -0.8658813 -1.0580524 0.7167423 -0.6281798 0.1946347 -0.8353770 1.1322554
2 -0.9037814 -0.4057427 0.1770408 1.0089714 -0.4854060 -0.7882692 0.2083610
3 0.3173946 0.4510403 -0.9921773 -0.6707792 0.6896874 0.7166807 -0.2957887
4 1.0578427 0.5097239 0.6680341 0.3041874 -0.5639337 0.3896130 -0.5688130
c10 c11 c12
1 -0.3699948 0.2827328 1.43603416
2 -0.3740679 -1.2417615 0.04227002
3 1.0470441 0.3304235 -0.36506877
4 -0.7753277 0.6127083 -0.51286777
Clustering vector:
Waiting On the Wo... I Don't Trust Mys... Belief
3 2 3
Gravity The Heart of Life Vultures
1 4 2
Stop This Train Slow Dancing in a... Bold as Love
1 2 3
Dreaming with a B... In Repair I'm Gonna Find An...
4 3 4
Within cluster sum of squares by cluster:
[1] 22.16219 56.02710 71.22445 40.35515
(between_SS / total_SS = 49.3 %)
Available components:
[1] "cluster" "centers" "totss" "withinss" "tot.withinss"
[6] "betweenss" "size" "iter" "ifault"
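The k-means summary above can be cross-checked with a few lines of arithmetic. The Python sketch below (for illustration only) tallies the cluster sizes from the clustering vector and recomputes the between_SS / total_SS ratio. It rests on one assumption that the output does not state: that the 34 features were z-scored over these 12 tracks, so that the total sum of squares equals (12 - 1) × 34 = 374.

```python
from collections import Counter

# Cluster assignment per track, copied from the clustering vector above.
# Track titles are truncated exactly as in the printed output.
ASSIGNMENTS = {
    "Waiting On the Wo...": 3,
    "I Don't Trust Mys...": 2,
    "Belief": 3,
    "Gravity": 1,
    "The Heart of Life": 4,
    "Vultures": 2,
    "Stop This Train": 1,
    "Slow Dancing in a...": 2,
    "Bold as Love": 3,
    "Dreaming with a B...": 4,
    "In Repair": 3,
    "I'm Gonna Find An...": 4,
}
WITHIN_SS = [22.16219, 56.02710, 71.22445, 40.35515]

# Cluster sizes, to compare with "sizes 2, 3, 4, 3".
sizes = Counter(ASSIGNMENTS.values())
print([sizes[c] for c in sorted(sizes)])  # prints: [2, 3, 4, 3]

# ASSUMPTION: features were z-scored over these 12 tracks, so each of the
# 34 columns contributes (n - 1) to the total sum of squares.
N_FEATURES = 34
total_ss = (len(ASSIGNMENTS) - 1) * N_FEATURES
explained = 1 - sum(WITHIN_SS) / total_ss
print(round(100 * explained, 1))  # prints: 49.3
```

Under that assumption the result matches the reported 49.3 %, which means the four clusters account for roughly half of the variation between these twelve tracks.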
[1] 432.0 345.6
Verse: 87-107 sec
Pre-chorus: 108-126 sec
Chorus: 127-143 sec
Bridge: 144-182 sec
What stands out in this novelty function is that the chorus actually has the smallest peaks. The verse shows somewhat more peaks, and the bridge shows more than the verse. However, the most and the highest peaks can be found after the bridge. This is quite remarkable, since nothing special happens in the song at that point: it is just a short intermezzo which builds up to the next chorus. The chromagram and timbre cepstrogram do not show many important details about the structure and melody of this song. However, the chromagram does clearly show that D (more precisely, the D-minor chord) is very present in the bridge section.
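To make the idea of "peaks" concrete, here is a minimal sketch of one common kind of novelty function: the half-wave rectified difference of log-energy between consecutive frames. It is written in Python for illustration and is not necessarily the method behind the plot discussed here. The frame energies are hypothetical, and the function names are my own.

```python
import math


def energy_novelty(frame_energies):
    """Positive part of the frame-to-frame change in log-energy."""
    log_e = [math.log(e + 1e-9) for e in frame_energies]
    novelty = [0.0]  # no previous frame to compare the first one with
    for prev, cur in zip(log_e, log_e[1:]):
        novelty.append(max(0.0, cur - prev))
    return novelty


def peak_indices(novelty, threshold):
    """Indices of local maxima that exceed the threshold."""
    return [
        i for i in range(1, len(novelty) - 1)
        if novelty[i] > threshold
        and novelty[i] >= novelty[i - 1]
        and novelty[i] > novelty[i + 1]
    ]


# Hypothetical frame energies: quiet start, a loud onset at frame 4,
# a slow decay, and a second onset at frame 8.
energies = [0.1, 0.1, 0.1, 0.1, 1.0, 0.8, 0.6, 0.5, 1.2, 1.0]
novelty = energy_novelty(energies)
peaks = peak_indices(novelty, threshold=0.5)
print(peaks)  # prints: [4, 8]
```

Only increases in energy count, so a section with many sudden entries (such as a build-up) produces more and higher peaks than a section that is loud but steady. That may be part of why a steady chorus can show smaller peaks than an intermezzo.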
Unfortunately, the code for my tempogram stopped working when I got home and tried to work on the assignment. If I could get the tempogram to work, I might include it in my portfolio.
Since I could not get the code of my tempograms to work, I figured it would be nice to include a histogram of the most popular song on every album. The most popular song on Room For Squares, of course, is Your Body Is a Wonderland. On Heavier Things it is Daughters, on Continuum it is Slow Dancing in a Burning Room, on Battle Studies it is Who Says, on Born and Raised it is Queen of California, on Paradise Valley it is Who You Love, and on The Search for Everything it is Love on the Weekend. This tempo histogram shows that most of these tracks have a tempo between 120 and 150 bpm (beats per minute). However, some tracks have a lower tempo, somewhere between 80 and 100 bpm. The most popular track of all, Your Body Is a Wonderland, has a tempo of 94 bpm, which is lower than the tempo of most of the other popular songs. The tempo does not really seem to depend on how old or new the track is.
The second histogram shows the fifteen most popular John Mayer songs overall. This visualisation can be divided into three parts. Six of the most popular tracks have a tempo of less than 100 beats per minute (bpm). Seven of the tracks seem to have a tempo between 115 and 145 bpm. The tempo of the other two tracks seems to be around 175 bpm. This shows that the largest group of songs lies between 115 and 145 bpm, and that a sizeable group has a tempo of 100 bpm or less. A tempo higher than 150 bpm, however, is not very common among the most popular John Mayer songs.
This visualisation shows the variation in tempo for the first album, Room For Squares, and the most recent album, The Search for Everything. The x-axis shows the mean tempo in beats per minute (bpm). The y-axis shows the standard deviation of the tempo. The size of the points shows the duration of the song, the color shows which album the song belongs to, and the transparency shows the loudness of the song in dBFS. The tracks on The Search for Everything seem to be more clustered, with some outliers on the y-axis, whereas the tracks on Room For Squares seem to be a bit more spread out. Besides this, there isn’t that big of a difference in the duration of the tracks. The loudness, however, does show more of a difference. The difference in loudness between the two albums does not seem that big, but the differences among the tracks on Room For Squares seem to be a bit clearer.
I chose three different examples to show the keygrams and chordograms of these tracks. The first track is Your Body Is a Wonderland, which is the biggest outlier and also the most popular track on the first album, Room For Squares. The second track is the second most popular track, Slow Dancing in a Burning Room, from the album Continuum. The last track is the most popular track on the most recent album, The Search for Everything.
What is very noticeable here is that the keygrams and chordograms don’t really show the right key or chords of the songs. Your Body Is a Wonderland is in the key of F major, but according to the keygram it should be in D major. In the keygram and chordogram of Your Body Is a Wonderland, there is a strange section somewhere between 140 and 180 seconds. This section is the bridge, where only one chord is played, which could be why it is difficult to see the right key here.
However, Your Body Is a Wonderland is not the only track where the keygram does not show the right key. The same thing happens with Slow Dancing in a Burning Room. The keygram clearly shows a darker line for the key of F# minor, while this song is actually in the key of C# minor. What can be noticed about this track is that there appear to be two sections (60-70 and 125-135 seconds) where the key seems to change. These sections are short intermezzos where the backing vocals sing and the instruments play a G#-minor chord, which explains the change of key in these sections.
The last keygram and chordogram show the key and chords of the track In the Blood, which is on the most recent album. The darkest line in the keygram seems to be the key of B minor. However, there are two other darker lines, at the keys of B major and Ab major. This track is not in the key of B minor; it is in the key of Ab major. So the keygram comes closest to the actual key for this track. The chordogram shows the darkest lines for the chords B major, B7, Ab major and Ab7. Unfortunately, the chords B major and B7 are nowhere to be found in this track. However, since the song is in the key of Ab major, the Ab-major chord and some variations of this chord can be found in the song.
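One way to see why a keygram can land on the wrong key is to look at how key estimation by template matching works. The Python sketch below is an illustration under assumptions: it uses the Krumhansl-Kessler key profiles and Pearson correlation, which may differ from the templates and distance measure behind the plots discussed here. The chroma vectors are hypothetical and the function names are my own.

```python
import math

PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler key profiles for C major and C minor.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rotate(profile, tonic):
    """Shift a C-based profile so that its first entry lands on `tonic`."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def key_scores(chroma):
    """Correlation of a 12-bin chroma vector with all 24 key templates."""
    scores = {}
    for tonic, name in enumerate(PITCH_CLASSES):
        scores[name + " major"] = pearson(chroma, rotate(MAJOR, tonic))
        scores[name + " minor"] = pearson(chroma, rotate(MINOR, tonic))
    return scores


def best_key(chroma):
    scores = key_scores(chroma)
    return max(scores, key=scores.get)


# Hypothetical chroma that follows the F-major profile exactly.
print(best_key(rotate(MAJOR, 5)))  # prints: F major

# Hypothetical chroma of a passage that contains only a D-minor chord (D, F, A).
d_minor_chord = [0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
print(best_key(d_minor_chord))  # prints: D minor
```

In this sketch, a passage that holds a single D-minor chord is scored as D minor even when the song as a whole is in F major. That is consistent with the idea that a bridge built on one chord can pull the keygram away from the real key.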
The first visualisation is a timbre cepstrogram of the biggest outlier, which is the track Your Body Is a Wonderland on the album Room For Squares. The magnitude seems to be the highest for the two lowest bars.
The second visualisation is a chromagram of Your Body Is a Wonderland, the most popular track (with a track popularity of 76) across all John Mayer albums. It shows that the song is in the key of F major. The chords C and F are often played in this song, which can also be seen in this chromagram. Something else that can be noticed is that D shows a higher magnitude for a short stretch. This stretch is the bridge, where the D-minor chord is played, which can clearly be seen in the chromagram.
These are three self-similarity matrices. The first self-similarity matrix is of the biggest outlier, Your Body Is a Wonderland. It shows one clear diagonal line, which is where each moment of the track is compared with itself as the music unfolds in time. The next noticeable element is the pair of more yellow lines (one horizontal and one vertical), which form some kind of window pane. These indicate that something unexpected happens at this point in time: the overall sound changes a lot. The last element, homogeneity, is actually not very noticeable here. It should show up as blocks that mark different segments in the song that sound similar.
The second self-similarity matrix shows the second most popular track, which is “Slow Dancing in a Burning Room” (from the album “Continuum”). It differs from the self-similarity matrix of Your Body Is a Wonderland in two main ways: the change of sound (possibly the bridge) takes place earlier in Slow Dancing in a Burning Room, and the checkerboard pattern can be seen more clearly.
The third self-similarity matrix shows the most popular song of the most recent album, which is “In the Blood”. The diagonal line shows the music unfolding in time. The more yellow horizontal and vertical lines, which appear somewhere between 150 and 200 seconds, mark a change in the music. In this case it is a short guitar solo or bridge in the song. The checkerboard pattern in this self-similarity matrix looks clearer than in the other two self-similarity matrices. This pattern shows homogeneity: it indicates that the track contains multiple segments that sound similar.
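The diagonal, the "window pane" lines and the checkerboard blocks all follow from how a self-similarity matrix is built: every frame of the track is compared with every other frame. The Python sketch below illustrates this with cosine distance on a few hypothetical two-dimensional feature frames. The real matrices use chroma or timbre vectors per segment, and the exact distance measure used for the plots is an assumption here.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity; 0 means the two frames point the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def self_similarity(frames):
    """Distance of every frame to every other frame."""
    return [[cosine_distance(u, v) for v in frames] for u in frames]


# Hypothetical feature frames: section A, a contrasting section B, then A again.
frames = [
    [1.0, 0.1], [1.0, 0.2],   # section A
    [0.1, 1.0], [0.2, 1.0],   # section B (for example a bridge)
    [1.0, 0.1], [1.0, 0.2],   # section A returns
]
ssm = self_similarity(frames)

for row in ssm:
    print(" ".join(f"{value:.2f}" for value in row))
```

The diagonal is zero because each frame is identical to itself. Frames from the same section give small distances (the blocks of the checkerboard), and the contrasting section gives a band of large distances against everything else (the window pane).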
For this corpus, I wanted to compare the older John Mayer albums to the newer John Mayer albums. The first thing I did was compare only the first and the last album to see what kind of differences I could find, especially in the popularity and the valence of the tracks. The next step was to make a scatterplot that easily shows these differences. I then made this into an interactive scatterplot to be able to show more of the information and to show which song is which. However, this scatterplot would only show two of the seven albums in total. So, my next step was to include all of the albums and make good visuals that show as much information as possible about every album. The first visual is a boxplot. I made a boxplot of every album and put them next to each other to be able to compare all of them. What I compared here is only the track popularity. Since this would not be enough information, I also made a scatterplot. This scatterplot shows the albums in different colors, the energy, the valence, and of course the track popularity. My goal here is to find out whether the older or the newer albums of John Mayer are more popular, and if so, why. With all of the information and the visuals, I hope to find an answer to these questions.
So, I will use boxplots and interactive scatterplots to try to find more information about the track popularity, the valence, the energy, the mode, and the loudness of John Mayer’s albums. With this information I hope to find an answer to whether the older or newer albums are more popular and why.
This scatterplot shows the track popularity, the valence, the loudness and the mode of the tracks. The track popularity is shown on the y-axis, the x-axis shows the valence, the color of the dots shows the mode and the size of the dots shows the loudness. Since this is an interactive scatterplot, it is also easier to see which song has the highest popularity, valence or loudness and in which mode the song is written. What is very noticeable here is that the range of the track popularity of the most recent album (The Search for Everything) is much narrower than that of his first album (Room For Squares). What is also easily noticeable is that the first album has a much bigger outlier than the most recent album. However, even with these visuals, it remains hard to find a real connection between the popularity and the valence, the mode or the loudness. That is why I also looked up the mean and standard deviation of nine different features that could be important for the popularity of the albums.
| Room For Squares | Mean | Standard Deviation |
|---|---|---|
| Danceability | 0.609 | 0.0637 |
| Energy | 0.676 | 0.130 |
| Valence | 0.476 | 0.167 |
| Mode | 0.769 | 0.439 |
| Key | 4.08 | 2.56 |
| Tempo (bpm) | 106 | 20.9 |
| Instrumentalness | 0.00818 | 0.0158 |
| Speechiness | 0.0283 | 0.00378 |
| Track Popularity | 53.5 | 9.79 |

| The Search for Everything | Mean | Standard Deviation |
|---|---|---|
| Danceability | 0.642 | 0.141 |
| Energy | 0.465 | 0.167 |
| Valence | 0.498 | 0.226 |
| Mode | 0.833 | 0.389 |
| Key | 5.67 | 3.37 |
| Tempo (bpm) | 126 | 32.3 |
| Instrumentalness | 0.0836 | 0.268 |
| Speechiness | 0.0338 | 0.0121 |
| Track Popularity | 60.9 | 3.32 |
There are some clear differences between these two albums. The first difference is the energy level, which is higher for Room For Squares (a mean of 0.676 versus 0.465). The second difference is the tempo, which is higher for the most recent album, The Search for Everything (126 versus 106 bpm). The third noticeable difference is that the instrumentalness is higher for The Search for Everything (0.0836 versus 0.00818), although its standard deviation of 0.268 is large. The last, and for this corpus probably the most important, difference is that the track popularity is higher for The Search for Everything (60.9 versus 53.5). This shows that, even though the most popular song is on Room For Squares, The Search for Everything has a higher overall popularity.
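To put the differences on a comparable scale, the means can be set against the spread within each album. The Python sketch below copies the values from the two tables and computes, for each feature, the difference in means and a rough standardised difference (the difference divided by the root mean square of the two standard deviations). This standardised difference is my own illustrative choice and not part of the original analysis. With only a dozen or so tracks per album, it should be read as a rough guide at most.

```python
import math

# (mean, standard deviation) per feature, copied from the two tables.
ROOM_FOR_SQUARES = {
    "Danceability": (0.609, 0.0637),
    "Energy": (0.676, 0.130),
    "Valence": (0.476, 0.167),
    "Mode": (0.769, 0.439),
    "Key": (4.08, 2.56),
    "Tempo": (106.0, 20.9),
    "Instrumentalness": (0.00818, 0.0158),
    "Speechiness": (0.0283, 0.00378),
    "Track Popularity": (53.5, 9.79),
}
SEARCH_FOR_EVERYTHING = {
    "Danceability": (0.642, 0.141),
    "Energy": (0.465, 0.167),
    "Valence": (0.498, 0.226),
    "Mode": (0.833, 0.389),
    "Key": (5.67, 3.37),
    "Tempo": (126.0, 32.3),
    "Instrumentalness": (0.0836, 0.268),
    "Speechiness": (0.0338, 0.0121),
    "Track Popularity": (60.9, 3.32),
}


def mean_difference(older, newer, feature):
    """Newer album mean minus older album mean."""
    return newer[feature][0] - older[feature][0]


def standardised_difference(older, newer, feature):
    """Mean difference divided by the root mean square of the two SDs."""
    (_, sd_old), (_, sd_new) = older[feature], newer[feature]
    pooled = math.sqrt((sd_old ** 2 + sd_new ** 2) / 2)
    return mean_difference(older, newer, feature) / pooled


# Features sorted from the largest to the smallest standardised difference.
ranking = sorted(
    ROOM_FOR_SQUARES,
    key=lambda f: abs(standardised_difference(ROOM_FOR_SQUARES, SEARCH_FOR_EVERYTHING, f)),
    reverse=True,
)
for feature in ranking:
    diff = mean_difference(ROOM_FOR_SQUARES, SEARCH_FOR_EVERYTHING, feature)
    std = standardised_difference(ROOM_FOR_SQUARES, SEARCH_FOR_EVERYTHING, feature)
    print(f"{feature:18s} {diff:+9.4f} {std:+6.2f}")
```

A positive value means the feature is higher on The Search for Everything. Dividing by the spread keeps features with large units, such as tempo, from dominating the comparison simply because their numbers are bigger.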
These boxplots show the popularity of every John Mayer album. They show that the albums “Continuum” and “The Search for Everything” seem to contain the most popular tracks. However, they also show that Continuum contains slightly more popular tracks than The Search for Everything, which is why, according to these boxplots, Continuum can be called the most popular John Mayer album. This album is from the year 2006, which would mean that his older music is more popular than his newer music. However, I think that the success of this album does not have much to do with how old or new it is, but more with the guitar skills in the tracks of this album.
This interactive scatterplot shows all of the albums in different colors. Because of the different colors it is easy to see, for example, which albums seem to be more popular and which album contains the most outliers. In this scatterplot, the valence is shown on the x-axis and the track popularity is shown on the y-axis. The size of the dots shows the energy. It can easily be seen that the biggest outlier (mostly in popularity) is the track “Your Body Is a Wonderland” from the album Room For Squares, which is also the oldest album in the scatterplot. The other three outliers, however, are the tracks “Slow Dancing In A Burning Room”, “Gravity”, and “Waiting On the World to Change”, which are all on the album Continuum. Even though these seem to be the most popular tracks, the plot does not show much about their valence or energy. It seems that there is no real connection between the track popularity and the valence and/or energy.
From these visualisations it seems that, even though the biggest outlier is on the Room For Squares album, Continuum is the most popular of all John Mayer albums. However, it is still uncertain why this is the most popular album. The valence and energy levels, unfortunately, do not show why this album seems to be the most popular. There are no really noticeable differences in valence or energy between the tracks of this album and the tracks of the other albums. I think that the high popularity of this album is mostly due to the great guitar skills and the guitar solos, but unfortunately this remains unknown.